Appendix A 



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on:          August 13, 2005, 09:51:41 ; Search time 3878.11 Seconds
                                             (without alignments)
                                             9982.023 Million cell updates/sec

Title:           US-10-511-270-3
Perfect score:   1017
Sequence:        1 cgggatccatgctgggcccc tgagctgtctcagaattccg 1017

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        34239544 seqs, 19032134700 residues

Total number of hits satisfying chosen parameters:      68479088

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
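The reported throughput can be cross-checked against the other header figures. The sketch below is illustrative only. It assumes that one "cell update" is one query residue compared with one database residue, and that both strands were searched (the hit count, 68479088, is exactly twice the number of database sequences). Neither assumption is stated in the printout, and the variable names are hypothetical.

```python
# Figures copied from the run header.
query_length = 1017             # length of the query (Sequence line: 1 ... 1017)
db_residues = 19_032_134_700    # "Searched: ... residues"
search_seconds = 3878.11        # "Search time"
strands = 2                     # ASSUMPTION: both strands searched (68479088 = 2 * 34239544)

# ASSUMPTION: one cell update = one query residue vs. one database residue.
cells = query_length * db_residues * strands
rate_millions = cells / search_seconds / 1e6

print(f"{rate_millions:.3f} million cell updates/sec (header reports 9982.023)")
```

Under these assumptions the computed rate agrees with the header to within rounding of the search time.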

Database :       EST:*
                 1:  gb_est1:*
                 2:  gb_est2:*
                 3:  gb_htc:*
                 4:  gb_est3:*
                 5:  gb_est4:*
                 6:  gb_est5:*
                 7:  gb_est6:*
                 8:  gb_gss1:*
                 9:  gb_gss2:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



                                 SUMMARIES

                 %
Result         Query
   No.  Score  Match  Length  DB  ID        Description

     1  921.2   90.6    1635   3  AK002457  AK002457 Mus muscu
     2  908.6   89.3    1596   3  AK010857  AK010857 Mus muscu
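The "% Query Match" column is consistent with the score expressed as a percentage of the perfect score (1017). The printout does not define the column, so the formula below is an assumption checked only against the two rows listed here.

```python
# Values copied from the header and the SUMMARIES table.
perfect_score = 1017.0
scores = {"AK002457": 921.2, "AK010857": 908.6}

# ASSUMPTION: Query Match (%) = 100 * score / perfect score.
query_match = {acc: round(100.0 * s / perfect_score, 1) for acc, s in scores.items()}

for acc, pct in query_match.items():
    print(acc, pct)
```

Both computed values agree with the table (90.6 and 89.3).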



                                 ALIGNMENTS


RESULT 1
AK002457
LOCUS       AK002457   1635 bp   mRNA   linear   HTC 03-APR-2004
DEFINITION  Mus musculus adult male kidney cDNA, RIKEN full-length enriched
            library, clone:0610010D20 product:hypothetical Aminoacyl-transfer
            RNA synthetases class-II/Dihydrodipicolinate synthetase containing
            protein, full insert sequence.
ACCESSION   AK002457
VERSION     AK002457.1  GI:12832454
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
FEATURES             Location/Qualifiers
     source          1..1635
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:0610010D20"
                     /db_xref="taxon:10090"
                     /clone="0610010D20"
                     /sex="male"
                     /tissue_type="kidney"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="adult"
     CDS             68..1033
                     /note="unnamed protein product; hypothetical
                     Aminoacyl-transfer RNA synthetases
                     class-II/Dihydrodipicolinate synthetase containing protein
                     (InterPro|IPR002106, InterPro|IPR002220, evidence:
                     InterPro)
                     putative"
                     /codon_start=1
                     /protein_id="BAB22114.1"
                     /db_xref="GI:12832455"
                     /translation="MLGPQIWASMRQGLSRGLSRNVKGKKVDIAGIYPPVTTPFTATA
                     EVDYGKLEENLNRLATFPFRGFVVQGSTGEFPFLTSLERLEVVSRVRQAIPKDKFLIA
                     GSGCESTQATVEMTVSMAQVGADVAMVVTPCYYRGRMSSAALIHHYTKVADVSPIPVV
                     LYSVPANTGLELPVDAVVTLSQHPNIIGLKDSGGDVTRIGLIVHKTSKQDFQVLAGSA
                     GFLLASYAVGAVGGICGLANVLGAQVCQLERLCLTGQWEAAQELQHRLIEPNTAVTRR
                     FGIPGLKKTMDWFGYYGGPCRAPLQELSPTEEEALRLDFSNNGWL"
ORIGIN

  Query Match             90.6%;  Score 921.2;  DB 3;  Length 1635;
  Best Local Similarity   94.7%;  Pred. No. 1.8e-234;
  Matches  953;  Conservative    0;  Mismatches   53;  Indels    0;  Gaps    0;
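"Best Local Similarity" is consistent with the fraction of aligned columns that are identical. The printout does not give the formula, so the sketch below is an assumption, checked against the figures reported for this result.

```python
# Counts copied from the statistics line for result 1 (AK002457).
matches, mismatches, indels = 953, 53, 0

# ASSUMPTION: every aligned column is a match, a mismatch, or an indel.
aligned_columns = matches + mismatches + indels

# ASSUMPTION: Best Local Similarity (%) = 100 * matches / aligned columns.
best_local_similarity = round(100.0 * matches / aligned_columns, 1)

print(aligned_columns, best_local_similarity)
```

The column count (1006) also equals the query span of the alignment, positions 9 to 1014.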



Qy          9 ATGCTGGGCCCCCAAATCTGGGCCTCCATGAGGCAGGGGCTGAGCAGGGGCTTGTCTAGG 68
              ||||||||||||||||| |||||||||||||||||||| |||||||||||||||||||||
Db         68 ATGCTGGGCCCCCAAATTTGGGCCTCCATGAGGCAGGGTCTGAGCAGGGGCTTGTCTAGG 127


Qy         69 AACGTGAAGGGGAAGAAGATAGACATTGCCGGCATCTACCCACCCGTGACCACCCCATTC 128
              || |||||||| |||||| |||||||||||||||||||||||||||||||||||||||||
Db        128 AATGTGAAGGGCAAGAAGGTAGACATTGCCGGCATCTACCCACCCGTGACCACCCCATTC 187


Qy        129 ACCGCCACCGCAGAAGTAGACTATGGGAAACTGGAAGAGAACCTGAACAAACTGGCCGCC 188
              |||||||||||||| |||||||||||||||||||||||||||||||||| ||||||| ||
Db        188 ACCGCCACCGCAGAGGTAGACTATGGGAAACTGGAAGAGAACCTGAACAGACTGGCCACC 247


Qy        189 TTCCCCTTTCGAGGCTTCGTGGTCCAGGGCTCTACTGGAGAGTTTCCATTCCTGACCAGC 248
              ||||||||||||||||||||||||||||| || |||||||||||||| ||||||||||||
Db        248 TTCCCCTTTCGAGGCTTCGTGGTCCAGGGTTCGACTGGAGAGTTTCCGTTCCTGACCAGC 307


Qy        249 CTTGAGCGCCTAGAGGTGGTGAGCCGAGTGCGCCAGGCCATACCCAAGGACAAGCTCCTG 308
              || |||||||| |||||||||||||| ||||||||||||||||||||||||||| |||||
Db        308 CTCGAGCGCCTGGAGGTGGTGAGCCGCGTGCGCCAGGCCATACCCAAGGACAAGTTCCTG 367


Qy        309 ATAGCCGGCTCTGGCTGCGAGTCCACGCAAGCCACAGTAGAGATGACTGTCAGCATGGCT 368
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        368 ATAGCCGGCTCTGGCTGCGAGTCCACGCAAGCCACAGTAGAGATGACTGTCAGCATGGCT 427


Qy        369 CAGGTGGGTGCTGATGCCGCCATGGTGGTGACCCCTTGTTACTATCGCGGCCGCATGAAC 428
              |||||||||||||||| |||||||||||||||||||||||||||||| |||||||||| |
Db        428 CAGGTGGGTGCTGATGTCGCCATGGTGGTGACCCCTTGTTACTATCGTGGCCGCATGAGC 487


Qy        429 AGCGCTGCCCTCATTCACCACTACACCAAGGTTGCTGATCTTTCTCCAATCCCGGTGGTG 488
              ||||||||||||||||||||||||||||||||||||||  ||||||||||||| ||||||
Db        488 AGCGCTGCCCTCATTCACCACTACACCAAGGTTGCTGACGTTTCTCCAATCCCTGTGGTG 547


Qy        489 CTGTACAGTGTCCCAGGCAACACGGGTCTAGAGCTGCCTGTGGATGCCGTGGTCACATTG 548
               ||||||||||||||| ||| ||||| |||||||| ||||||||||||||||| ||||||
Db        548 TTGTACAGTGTCCCAGCCAATACGGGGCTAGAGCTACCTGTGGATGCCGTGGTTACATTG 607


Qy        549 TCTCAGCACCCAAATATCATTGGCTTGAAGGACAGTGGTGGAGATGTGACCAGGACTGGG 608
              |||||||||||||||||||| |||||||||||||||||||||||||||||||||| |||
Db        608 TCTCAGCACCCAAATATCATCGGCTTGAAGGACAGTGGTGGAGATGTGACCAGGATTGGA 667


Qy        609 CTGATTGTTCACAAGACCAGCAAGCAGGATTTCCAGGTGTTGGCTGGGTCAGTTGGCTTC 668
              ||||| |||||||||||||||||||||||||||||||||||||||||||||| |||||||
Db        668 CTGATAGTTCACAAGACCAGCAAGCAGGATTTCCAGGTGTTGGCTGGGTCAGCTGGCTTC 727


Qy        669 CTCCTGGCCAGCTATGCTGTGGGAGCTGTTGGGGGCATATGTGGCCTGGCCAATGTCTT^ 728
Db        728 CTCCTGGCCAGCTATGCTGTGGGAGCTGTTGGGGGCATATGTGG 787


Qy        729 GGGGCCCAGGTCTGCCAGCTGGAGAGACTCTGCCTCAC^ 788
Db        788 GGGGCCCAGGTGTGCCAGCTGGAGAGACTCTGCCTCACAC<X5CAGTGGGAAGCTGCCCAG 847


Qy        789 AGACTGCAGCACCGCCTCATCGAGCCC^CACTGCGGTGACCCGGCGCTTTGGAATAC^ 848
Db        848 GAACTACAGCACCGTCTCATCGAGCCCAACACTGCGGTGACCCGGCGCTTTGGAATACCA 907


Qy        849 GGGCTGAAGAAAACCATGGACTGGTTTGGCTACTATGGAGGTCCCTC 908
Db        908 C^CTGAAGAAAACCATGGACTC^TTTGGCTACTATGGAGGTCCCTGCCGCGCCCCGTTG 967


Qy        909 GAGGAGTTGAGCCCCTCAGAGGAAGAGGCGCTTCGCTT^ 968
Db        968 CAGGAGCTGAGCCCCACAGAGGAGGAGGCACTGCGCTTGG^ 1027


Qy        969 CTTTAATGACAAGCGGGGGACACCTGGTCTGAGC^ 1014
Db       1028 CTTTAATGACAAGCAGGAGACGCCTGGCCTGAGCT 1073





LOCUS       AK010857   1596 bp   mRNA   linear   HTC 03-APR-2004
DEFINITION  Mus musculus 13 days embryo liver cDNA, RIKEN full-length enriched
            library, clone:2500002N04 product:hypothetical Aminoacyl-transfer
            RNA synthetases class-II/Dihydrodipicolinate synthetase containing
            protein, full insert sequence.
ACCESSION   AK010857
VERSION     AK010857.1  GI:12846588
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
FEATURES             Location/Qualifiers
     source          1..1596
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:2500002N04"
                     /db_xref="taxon:10090"
                     /clone="2500002N04"
                     /tissue_type="liver"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="13 days embryo"
     CDS             28..1020
                     /note="unnamed protein product; hypothetical
                     Aminoacyl-transfer RNA synthetases
                     class-II/Dihydrodipicolinate synthetase containing protein
                     (InterPro|IPR002106, InterPro|IPR002220, evidence:
                     InterPro)
                     putative"
                     /codon_start=1
                     /protein_id="BAB27226.1"
                     /db_xref="GI:12846589"
                     /translation="MLGPQIWASMRQGLSRGLSRNVKGMKVDIAGIYPPVTTPFTATA
                     EVDYGKLEENLNRLATFPFRGFVVQGSTGEFPFLTSLERLEVVSRVRQAIPKDKFLIA
                     GSGCESTQATVEMTVSMAQVGADVAMVVTPCYYRGRMSSAALIHHYTKVADVSPIPVV
                     LYSVPANTGLELPVDAVVTLSQHPNIIGLKDSGGDVTRIGLIVHKTSKQDFQVLAGSA
                     GFLLASYAVGAVGGICGLANVLGAQVCQLERLCLTGQWEAAQELQHRLIEPKHCGDPA
                     LWNTRAEENHGLVWLLWRSLPRPVAGAEPHRGGGTALGFQQQWLALMTSRRRLA"
ORIGIN

  Query Match             89.3%;  Score 908.6;  DB 3;  Length 1596;
  Best Local Similarity   94.5%;  Pred. No. 4.2e-231;
  Matches  952;  Conservative    0;  Mismatches   54;  Indels    1;  Gaps    1;
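The two reported scores can be reproduced from the match, mismatch and gap counts under a simple linear model. This is only a consistency check. The mismatch penalty of 0.6 is not printed anywhere in the header; it is inferred here from the figures for result 1 (953 matches, 53 mismatches, score 921.2) and should be treated as an assumption, as should the reading that a gap costs Gapop plus Gapext per gapped base. The function name is hypothetical.

```python
MATCH_SCORE = 1.0        # ASSUMPTION: IDENTITY_NUC awards 1 per identical base
MISMATCH_PENALTY = 0.6   # ASSUMPTION: inferred from result 1, not stated in the printout
GAP_OPEN = 10.0          # Gapop, from the header
GAP_EXTEND = 1.0         # Gapext, from the header

def alignment_score(matches, mismatches, gaps, indels):
    """Score under the assumed model: each gap costs GAP_OPEN,
    plus GAP_EXTEND for every inserted or deleted base."""
    gap_cost = gaps * GAP_OPEN + indels * GAP_EXTEND
    return matches * MATCH_SCORE - mismatches * MISMATCH_PENALTY - gap_cost

result_1 = round(alignment_score(953, 53, gaps=0, indels=0), 1)  # AK002457
result_2 = round(alignment_score(952, 54, gaps=1, indels=1), 1)  # AK010857
print(result_1, result_2)
```

Under these assumptions the model returns the reported scores for both results (921.2 and 908.6); the single-base gap in result 2 accounts for 11 points.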



Qy          9 ATGCTGGGCCCCCAAATCTGGGCCTCCATGAGGCAGGGGCTGAGCAGGGGCTTGTCTAGG 68
              ||||||||||||||||| |||||||||||||||||||| |||||||||||||||||||||
Db         28 ATGCTGGGCCCCCAAATTTGGGCCTCCATGAGGCAGGGTCTGAGCAGGGGCTTGTCTAGG 87



Qy         69 AACGTGAAGGGGAAGAAGATAGACATTGCCGGCATCTACCCACCCGTGACCACCCCATTC 128
              || |||||||| | |||| |||||||||||||||||||||||||||||||||||||||||
Db         88 AATGTGAAGGGCATGAAGGTAGACATTGCCGGCATCTACCCACCCGTGACCACCCCATTC 147

Qy        129 ACCGCCACCGCAGAAGTAGACTATGGGAAACTGGAAGAGAACCTGAACAAACTGGCCGCC 188
Db        148 ACCGCCACCGCAGAGGTAGACTATGGGAAACTGGAAGAGAACCTGAA 207

Qy        189 TTCCCCTTTCGAGGCTTCGTGGTCCAGGGCTCTACTGGAGAGTTTCCATTCCTGACCAGC 248
              ||||||||||||||||||||||||||||| || |||||||||||||| ||||||||||||
Db        208 TTCCCCTTTCGAGGCTTCGTGGTCCAGGGTTCGACTGGAGAGTTTCCGTTCCTGACCAGC 267



Qy        249 CTTGAGCGCCTAGAGGTGGTGAGCCGAGTGCGCCAGGCCATACCCAAGGACAAGCTCCTG 308
              || |||||||| |||||||||||||| ||||||||||||||||||||||||||| |||||
Db        268 CTCGAGCGCCTGGAGGTGGTGAGCCGCGTGCGCCAGGCCATACCCAAGGACAAGTTCCTG 327



Qy        309 ATAGCCGGCTCTGGCTGCGAGTCCACGCAAGCCACAGTAGAGATGACTGTCAGCATGGCT 368
Db        328 ATAGCCGGCTCTGGCTGCGAGTCCACGCAAGCCACAGTAGAGATGACTG 387



Qy        369 CAGGTGGGTGCTGATGCCGCCATGGTGGTGACCCCTTGTTACTATCGCGGCCGCATGAAC 428
              |||||||||||||||| |||||||||||||||||||||||||||||| |||||||||| |
Db        388 CAGGTGGGTGCTGATGTCGCCATGGTGGTGACCCCTTGTTACTATCGTGGCCGCATGAGC 447



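Qy        429 [sequence illegible] 488
Db        448 [sequence illegible] 507

Qy        489 [sequence illegible] 548
Db        508 [sequence illegible] 567

Qy        549 [sequence illegible] 608
Db        568 [sequence illegible] 627

Qy        609 [sequence illegible] 668
Db        628 CTGATAGTTCACAAGACCAGCAAGCAGG^ 687

Qy        669 [sequence illegible] 728
Db        688 [sequence illegible] 747

Qy        729 [sequence illegible] 788
Db        748 [sequence illegible] 807

Qy        789 [sequence illegible] 847
Db        808 [sequence illegible] 867

Qy        848 [sequence illegible] 907
Db        868 [sequence illegible] 927

Qy        908 [sequence illegible] 967
Db        928 [sequence illegible] 987

[unassigned fragment from this region: AGGGCTGAAGAAAAC^]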



Qy        968 GCTTTAATGA^ 1014
Db        988 GCTTTAATGACAAGCAGGAGACGCCT^ 1034



